We have discussed statistical models. We now study a family of models with a specific structure, which gives them many useful properties.
Exponential Family
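A model $\mathcal{P} = \{P_\eta : \eta \in \Xi\}$ is an $s$-parameter exponential family if it is defined by a family of densities of the form
$$p_\eta(x) = e^{\eta^\top T(x) - A(\eta)}\, h(x)$$
w.r.t. a common dominating measure ($\mu$).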
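The parts of the formula $p_\eta(x) = e^{\eta^\top T(x) - A(\eta)} h(x)$ have distinct names:
$T(x) \in \mathbb{R}^s$: sufficient statistic,
$h(x) \ge 0$: carrier density/base density,
$\eta \in \mathbb{R}^s$: natural parameter,
$A(\eta)$: log-partition function.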
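By definition, a density integrates to one. Integrating both sides of $p_\eta(x) = e^{\eta^\top T(x) - A(\eta)} h(x)$ w.r.t. $\mu$ over $\mathcal{X}$ gives
$$A(\eta) = \log \int_{\mathcal{X}} e^{\eta^\top T(x)}\, h(x)\, d\mu(x). \tag{1.2}$$
We should restrict to $A(\eta) < \infty$, so define the natural parameter space
$$\Xi_1 = \{\eta \in \mathbb{R}^s : A(\eta) < \infty\}.$$
One can prove that $A$ is a convex function, so $\Xi_1$ is a convex set.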
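A short sketch of why convexity holds, using Hölder's inequality (this argument is a standard one and is added here for completeness; it is not taken from the original notes). For $\eta_1, \eta_2$ with $A(\eta_1), A(\eta_2) < \infty$ and $c \in (0, 1)$, apply Hölder with exponents $1/c$ and $1/(1-c)$:

$$
\begin{aligned}
e^{A(c\eta_1 + (1-c)\eta_2)}
&= \int \left(e^{\eta_1^\top T(x)} h(x)\right)^{c} \left(e^{\eta_2^\top T(x)} h(x)\right)^{1-c} d\mu(x) \\
&\le \left(\int e^{\eta_1^\top T(x)} h(x)\, d\mu(x)\right)^{c} \left(\int e^{\eta_2^\top T(x)} h(x)\, d\mu(x)\right)^{1-c} \\
&= e^{cA(\eta_1) + (1-c)A(\eta_2)}.
\end{aligned}
$$

Taking logarithms gives $A(c\eta_1 + (1-c)\eta_2) \le cA(\eta_1) + (1-c)A(\eta_2)$. The right-hand side is finite, so the convex combination also has a finite log-partition function, which is exactly convexity of the natural parameter space.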
Example
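Poisson distribution: recall that for $X \sim \mathrm{Pois}(\lambda)$, it has density
$$p_\lambda(x) = \frac{\lambda^x e^{-\lambda}}{x!} = \exp\{x \log\lambda - \lambda\}\, \frac{1}{x!}, \qquad x = 0, 1, 2, \dots$$
w.r.t. counting measure. So take $\eta = \log\lambda$, and $\{\mathrm{Pois}(\lambda) : \lambda > 0\}$ is an exponential family with sufficient statistic $T(x) = x$, base density $h(x) = 1/x!$ and log-partition function $A(\eta) = e^{\eta}$.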
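The representation is not unique. For example, we can take $h(x) = 1/x!$ w.r.t. counting measure; or $h(x) = 1$ w.r.t. the measure that puts mass $1/x!$ on each $x$.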
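Poisson sample of size $n$: for $X_1, \dots, X_n \overset{\text{iid}}{\sim} \mathrm{Pois}(\lambda)$, $p_\lambda(x) = \exp\{\log\lambda \sum_{i} x_i - n\lambda\} \prod_{i} \frac{1}{x_i!}$. So $T(x) = \sum_i x_i$ and $A(\eta) = n e^{\eta}$.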
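As a quick numerical sanity check of the Poisson example, the sketch below compares the usual Poisson pmf with the exponential-family form $e^{\eta T(x) - A(\eta)} h(x)$, taking $\eta = \log\lambda$, $T(x) = x$, $h(x) = 1/x!$ and $A(\eta) = e^{\eta}$. The function names and the value of `lam` are illustrative choices, not part of the notes.

```python
import math

def poisson_pmf(x, lam):
    """Standard Poisson pmf: lam^x * exp(-lam) / x!."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

def poisson_expfam(x, eta):
    """Exponential-family form exp(eta * T(x) - A(eta)) * h(x)."""
    T = x                          # sufficient statistic
    A = math.exp(eta)              # log-partition function
    h = 1.0 / math.factorial(x)    # base density w.r.t. counting measure
    return math.exp(eta * T - A) * h

lam = 3.5                          # illustrative mean
eta = math.log(lam)                # natural parameter
max_gap = max(abs(poisson_pmf(x, lam) - poisson_expfam(x, eta)) for x in range(30))
```

If the two forms agree, `max_gap` should be at the level of floating-point round-off.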
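1.1 Distribution of $T(X)$
If $X \sim p_\eta(x) = e^{\eta^\top T(x) - A(\eta)}$ w.r.t. $\mu$ (WLOG we let $h(x) = 1$; otherwise we let $d\tilde{\mu} = h\, d\mu$ to absorb $h$), then $T(X) \sim q_\eta(t) = e^{\eta^\top t - A(\eta)}$ w.r.t. $\nu$, where $\nu$ is the push-forward of $\mu$ through $T$:
$$\nu(B) = \mu\left(T^{-1}(B)\right).$$
So $T(X)$ itself follows an exponential family with the same natural parameter and log-partition function.
This is simplest in the discrete case (we can now drop the $h = 1$ assumption):
$$P_\eta(T(X) = t) = \sum_{x : T(x) = t} e^{\eta^\top T(x) - A(\eta)} h(x) = e^{\eta^\top t - A(\eta)} \sum_{x : T(x) = t} h(x),$$
and we denote $g(t) = \sum_{x : T(x) = t} h(x)$.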
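The discrete push-forward can be checked by brute force. The sketch below is an illustration under assumed choices (not from the notes): $X \in \{0,1\}^n$ is a vector of iid Bernoulli draws with density $e^{\eta \sum_i x_i - A(\eta)}$, $h(x) = 1$, $A(\eta) = n\log(1 + e^{\eta})$, and $T(x) = \sum_i x_i$. Summing the density over each level set of $T$ should give $e^{\eta t - A(\eta)}$ times the level-set total of $h$, which here is the binomial coefficient $\binom{n}{t}$.

```python
import itertools
import math

n, eta = 4, 0.7                      # illustrative sample size and natural parameter
A = n * math.log1p(math.exp(eta))    # log-partition of n iid Bernoulli draws

def T(x):
    return sum(x)                    # sufficient statistic

def h(x):
    return 1.0                       # base density on {0,1}^n

# Push the distribution of X forward through T by summing over level sets.
pmf_T = {}
g = {}
for x in itertools.product([0, 1], repeat=n):
    t = T(x)
    pmf_T[t] = pmf_T.get(t, 0.0) + math.exp(eta * t - A) * h(x)
    g[t] = g.get(t, 0.0) + h(x)      # level-set total of h

# Closed form: exp(eta * t - A) times the level-set total of h.
closed_form = {t: math.exp(eta * t - A) * g[t] for t in g}
```

Here `pmf_T` is the Binomial$(n, \theta)$ pmf with $\theta = e^{\eta}/(1 + e^{\eta})$, written as an exponential family in $t$.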
1.2 Canonical Form
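Based on the discussion above, we can simplify the structure of an exponential family by working directly with the sufficient statistic and absorbing the base density into the dominating measure:
$$q_\eta(t) = e^{\eta^\top t - A(\eta)} \quad \text{w.r.t. a measure } \nu \text{ on } \mathbb{R}^s.$$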
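By (1.2) we have
$$e^{A(\eta)} = \int e^{\eta^\top T(x)}\, h(x)\, d\mu(x).$$
We can differentiate this identity to get meaningful results. We use without proof that it is valid to swap differentiation and integration in the interior of the natural parameter space $\Xi_1$.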
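2.1 Mean of $T(X)$
Let $\eta_j$ denote the $j$th coordinate of $\eta$. Then
$$\frac{\partial}{\partial \eta_j} e^{A(\eta)} = \frac{\partial}{\partial \eta_j} \int e^{\eta^\top T(x)} h(x)\, d\mu(x)
\quad\Longrightarrow\quad
e^{A(\eta)} \frac{\partial A}{\partial \eta_j}(\eta) = \int T_j(x)\, e^{\eta^\top T(x)} h(x)\, d\mu(x).$$
Rearranging for $\partial A / \partial \eta_j$:
$$\frac{\partial A}{\partial \eta_j}(\eta) = \int T_j(x)\, e^{\eta^\top T(x) - A(\eta)} h(x)\, d\mu(x) = \mathbb{E}_\eta[T_j(X)], \qquad \text{i.e. } \nabla A(\eta) = \mathbb{E}_\eta[T(X)].$$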
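2.2 Variance of $T(X)$
Take a second partial derivative of $e^{A(\eta)} = \int e^{\eta^\top T(x)} h(x)\, d\mu(x)$:
$$e^{A(\eta)} \left( \frac{\partial^2 A}{\partial \eta_j \partial \eta_k} + \frac{\partial A}{\partial \eta_j} \frac{\partial A}{\partial \eta_k} \right) = \int T_j(x) T_k(x)\, e^{\eta^\top T(x)} h(x)\, d\mu(x).$$
Dividing by $e^{A(\eta)}$ and using $\partial A / \partial \eta_j = \mathbb{E}_\eta[T_j(X)]$, we finally get
$$\frac{\partial^2 A}{\partial \eta_j \partial \eta_k}(\eta) = \mathbb{E}_\eta[T_j T_k] - \mathbb{E}_\eta[T_j]\, \mathbb{E}_\eta[T_k] = \mathrm{Cov}_\eta(T_j, T_k), \qquad \text{i.e. } \nabla^2 A(\eta) = \mathrm{Var}_\eta(T(X)).$$
Here $\mathrm{Var}_\eta(T(X))$ is the $s \times s$ covariance matrix of the random vector $T(X)$.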
Example
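As we have shown above, in the Poisson exponential family $\eta = \log\lambda$, $T(x) = x$, $A(\eta) = e^{\eta}$. So
$$\mathbb{E}_\eta[X] = A'(\eta) = e^{\eta} = \lambda, \qquad \mathrm{Var}_\eta(X) = A''(\eta) = e^{\eta} = \lambda.$$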
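The Poisson identities can be checked numerically. The sketch below (illustrative names and step size; not from the notes) differentiates $A(\eta) = e^{\eta}$ by central finite differences and compares the results with the mean and variance obtained by summing the Poisson pmf directly.

```python
import math

def A(eta):
    """Log-partition function of the Poisson family."""
    return math.exp(eta)

def poisson_moments(lam, terms=200):
    """Mean and variance by direct summation of the pmf over x = 0..terms-1."""
    p = math.exp(-lam)            # P(X = 0)
    mean = second = 0.0
    for x in range(terms):
        mean += x * p
        second += x * x * p
        p *= lam / (x + 1)        # recurrence: P(X = x+1) = P(X = x) * lam / (x+1)
    return mean, second - mean ** 2

lam = 2.0                         # illustrative mean
eta = math.log(lam)
step = 1e-4                       # finite-difference step (an arbitrary choice)
dA = (A(eta + step) - A(eta - step)) / (2 * step)
d2A = (A(eta + step) - 2 * A(eta) + A(eta - step)) / step ** 2
mean, var = poisson_moments(lam)
```

Up to finite-difference error, `dA` should match `mean` and `d2A` should match `var`, both equal to $\lambda$.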
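2.3 MGF of $T(X)$
The moment generating function (MGF) of an $s$-dimensional random vector $Y$ is defined as $M_Y(u) = \mathbb{E}\left[e^{u^\top Y}\right]$ for $u \in \mathbb{R}^s$. The one-dimensional case is introduced separately. We can calculate moments by differentiating the MGF, as long as $M_Y(u)$ is well-defined in a neighborhood of $u = 0$. Now we evaluate the first moments:
$$\frac{\partial}{\partial u_j} M_Y(u) = \mathbb{E}\left[Y_j\, e^{u^\top Y}\right].$$
Letting $u = 0$, we obtain
$$\frac{\partial M_Y}{\partial u_j}(0) = \mathbb{E}[Y_j].$$
Similarly,
$$\frac{\partial^2 M_Y}{\partial u_j \partial u_k}(0) = \mathbb{E}[Y_j Y_k].$$
On the other hand, given $A(\eta)$, we can explicitly calculate the MGF of $T(X)$ for an exponential family:
$$M_{T(X)}(u) = \mathbb{E}_\eta\left[e^{u^\top T(X)}\right] = \int e^{(\eta + u)^\top T(x) - A(\eta)} h(x)\, d\mu(x) = e^{A(\eta + u) - A(\eta)},$$
valid whenever $A(\eta + u) < \infty$.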
Example
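Applying the formula $M_{T(X)}(u) = e^{A(\eta + u) - A(\eta)}$ to the Poisson distribution, with $A(\eta) = e^{\eta}$ and $\eta = \log\lambda$, we have
$$M_X(u) = \exp\{e^{\eta + u} - e^{\eta}\} = \exp\{\lambda (e^{u} - 1)\}.$$
To show it is useful, suppose $X_i \sim \mathrm{Pois}(\lambda_i)$, $i = 1, \dots, n$, independent, and we want to determine the distribution of $S = \sum_{i=1}^n X_i$. Then
$$M_S(u) = \prod_{i=1}^n M_{X_i}(u) = \exp\Big\{\Big(\sum_{i=1}^n \lambda_i\Big)(e^{u} - 1)\Big\}.$$
As a result, since an MGF that is finite near $0$ determines the distribution, we have $S \sim \mathrm{Pois}\left(\sum_{i=1}^n \lambda_i\right)$.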
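The conclusion that a sum of independent Poisson variables is again Poisson, with the rates added, can be cross-checked without MGFs by convolving two pmfs directly. This is a minimal sketch with illustrative rates `lam1` and `lam2`; it checks the claim for two summands only.

```python
import math

def poisson_pmf(x, lam):
    """Poisson pmf computed on the log scale for numerical stability."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))

def pmf_of_sum(s, lam1, lam2):
    """P(X1 + X2 = s) by direct convolution of two independent Poisson pmfs."""
    return sum(poisson_pmf(k, lam1) * poisson_pmf(s - k, lam2) for k in range(s + 1))

lam1, lam2 = 1.5, 2.5              # illustrative rates
max_gap = max(
    abs(pmf_of_sum(s, lam1, lam2) - poisson_pmf(s, lam1 + lam2))
    for s in range(25)
)
```

The convolution and the Poisson pmf with rate `lam1 + lam2` should agree up to round-off.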
2.4 CGF
The cumulant-generating function (CGF) is the log of the MGF: $K_Y(u) = \log M_Y(u)$. So for an exponential family,
$$K_{T(X)}(u) = A(\eta + u) - A(\eta).$$
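As a worked instance (added for illustration), take the Poisson family with $\eta = \log\lambda$ and $A(\eta) = e^{\eta}$. Since the CGF of $T(X)$ in an exponential family is $A(\eta + u) - A(\eta)$,

$$
\begin{aligned}
K_X(u) &= e^{\eta + u} - e^{\eta} = \lambda\,(e^{u} - 1), \\
\frac{d^k}{du^k} K_X(u)\Big|_{u = 0} &= \lambda \quad \text{for every } k \ge 1.
\end{aligned}
$$

So every cumulant of a Poisson variable equals $\lambda$; the first two recover the mean and the variance.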
3 Other Parameterizations
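Instead of parameterizing w.r.t. the natural parameter $\eta$, we can parameterize the family by another parameter $\theta$, so
$$p_\theta(x) = e^{\eta(\theta)^\top T(x) - B(\theta)}\, h(x), \qquad B(\theta) = A(\eta(\theta)).$$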
Example
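Poisson distribution: if indexed by the mean $\lambda$, $p_\lambda(x) = \frac{\lambda^x e^{-\lambda}}{x!}$ with $\eta(\lambda) = \log\lambda$ and $B(\lambda) = \lambda$ is an example.
Normal: $X \sim N(\mu, \sigma^2)$. With the usual parameter vector $\theta = (\mu, \sigma^2)$,
$$p_\theta(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big\{-\frac{(x - \mu)^2}{2\sigma^2}\Big\} = \exp\Big\{\frac{\mu}{\sigma^2}\, x - \frac{1}{2\sigma^2}\, x^2 - \frac{\mu^2}{2\sigma^2} - \frac{1}{2}\log(2\pi\sigma^2)\Big\},$$
so it is an exponential family with $\eta(\theta) = \left(\frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2}\right)$, $T(x) = (x, x^2)$, $h(x) = 1$, and $B(\theta) = \frac{\mu^2}{2\sigma^2} + \frac{1}{2}\log(2\pi\sigma^2)$.
Or we can rewrite it in the natural parameterization:
$$p_\eta(x) = \exp\{\eta_1 x + \eta_2 x^2 - A(\eta)\}, \qquad A(\eta) = -\frac{\eta_1^2}{4\eta_2} + \frac{1}{2}\log\Big(-\frac{\pi}{\eta_2}\Big), \qquad \eta_2 < 0.$$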
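Binomial: $X \sim \mathrm{Binom}(n, \theta)$ with $n$ fixed. So
$$p_\theta(x) = \binom{n}{x} \theta^x (1 - \theta)^{n - x} = \binom{n}{x} \exp\Big\{x \log\frac{\theta}{1 - \theta} + n \log(1 - \theta)\Big\},$$
so we can take $\eta(\theta) = \log\frac{\theta}{1 - \theta}$, $T(x) = x$, $h(x) = \binom{n}{x}$ and $B(\theta) = -n\log(1 - \theta)$.
Here $\eta = \log\frac{\theta}{1 - \theta}$ is called the logit/log-odds.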
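Beta: $X \sim \mathrm{Beta}(\alpha, \beta)$, then
$$p_{\alpha, \beta}(x) = \frac{x^{\alpha - 1}(1 - x)^{\beta - 1}}{B(\alpha, \beta)} = \exp\{\alpha \log x + \beta \log(1 - x) - \log B(\alpha, \beta)\}\, \frac{1}{x(1 - x)}, \qquad x \in (0, 1),$$
where $B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}$ is called the beta function. So we can take $\eta = (\alpha, \beta)$, $T(x) = (\log x, \log(1 - x))$, $h(x) = \frac{1}{x(1 - x)}$ and $A(\eta) = \log B(\alpha, \beta)$.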
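To make the change of parameterization concrete, the sketch below maps the usual normal parameters $(\mu, \sigma^2)$ to natural parameters $\eta = (\mu/\sigma^2, -1/(2\sigma^2))$ and checks that $\exp\{\eta_1 x + \eta_2 x^2 - A(\eta)\}$, with $A(\eta) = -\eta_1^2/(4\eta_2) + \tfrac12\log(-\pi/\eta_2)$, reproduces the normal density. Function names and test values are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Usual N(mu, sigma^2) density."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def to_natural(mu, sigma2):
    """Map (mu, sigma^2) to eta = (mu / sigma^2, -1 / (2 sigma^2))."""
    return mu / sigma2, -1.0 / (2 * sigma2)

def A(eta1, eta2):
    """Log-partition function in natural coordinates (requires eta2 < 0)."""
    return -eta1 ** 2 / (4 * eta2) + 0.5 * math.log(-math.pi / eta2)

def normal_expfam(x, eta1, eta2):
    """Exponential-family form with T(x) = (x, x^2) and h(x) = 1."""
    return math.exp(eta1 * x + eta2 * x * x - A(eta1, eta2))

mu, sigma2 = 1.2, 0.8              # illustrative parameters
eta1, eta2 = to_natural(mu, sigma2)
max_gap = max(
    abs(normal_pdf(x, mu, sigma2) - normal_expfam(x, eta1, eta2))
    for x in [-2.0, -0.5, 0.0, 1.2, 3.0]
)
```

The same pattern applies to the binomial example, where the map to the natural parameter is the logit.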
4 Interpretation: Exponential Tilting
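We can think of $p_\eta$ as an exponential tilt of the carrier $h$:
Start with the carrier $h(x)$.
Multiply by $e^{\eta^\top T(x)}$.
Re-normalize by dividing by $e^{A(\eta)}$.
$T$ can be viewed as giving a linear space of directions, $\{\eta^\top T(x) : \eta \in \mathbb{R}^s\}$, in which we can tilt $h$. The natural parameter space $\Xi_1$ is the set of all tilts after which normalization is possible (the integral does not go to infinity).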
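The three tilting steps can be sketched on a finite support. The example below is an assumed illustration, not from the notes: the carrier is $h(x) = \binom{n}{x}$ on $\{0, \dots, n\}$ with $T(x) = x$, and tilting by $\eta$ followed by re-normalization should produce the Binomial$(n, \theta)$ pmf with $\theta = 1/(1 + e^{-\eta})$.

```python
import math

n = 6                                          # illustrative support size
support = range(n + 1)
h = [math.comb(n, x) for x in support]         # step 1: the (unnormalized) carrier

def tilt(eta):
    """Steps 2 and 3: multiply the carrier by exp(eta * x), then re-normalize."""
    weights = [math.exp(eta * x) * h[x] for x in support]
    Z = sum(weights)                           # normalizing constant, Z = exp(A(eta))
    return [w / Z for w in weights], math.log(Z)

eta = -0.4                                     # illustrative tilt
p, A_eta = tilt(eta)

theta = 1 / (1 + math.exp(-eta))               # inverse logit of eta
binom = [math.comb(n, x) * theta ** x * (1 - theta) ** (n - x) for x in support]
```

On a finite support every tilt can be normalized; the restriction to a proper subset of tilts only bites when the sum or integral can diverge.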
5 Repeated Sampling from Exponential Families
One of the most important properties of exponential families is that a large sample can be summarized by a low-dimensional statistic.
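Suppose $X = (X_1, \dots, X_n)$ represents an iid sample from an exponential family $p_\eta(x) = e^{\eta^\top T(x) - A(\eta)} h(x)$. Then
$$p_\eta(x_1, \dots, x_n) = \prod_{i=1}^n e^{\eta^\top T(x_i) - A(\eta)} h(x_i) = \exp\Big\{\eta^\top \sum_{i=1}^n T(x_i) - nA(\eta)\Big\} \prod_{i=1}^n h(x_i).$$
This is an exponential family with sufficient statistic $\sum_{i=1}^n T(x_i)$, base density $\prod_{i=1}^n h(x_i)$ and log-partition function $nA(\eta)$.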
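One way to see the data reduction in practice: for an iid Poisson sample, the joint log-density is $\eta \sum_i x_i - n e^{\eta} - \sum_i \log x_i!$, so two samples of the same size with the same sum have log-likelihoods that differ only by a constant that does not depend on $\eta$. The sketch below checks this on two hand-picked samples (the samples and the grid of $\eta$ values are illustrative).

```python
import math

def log_lik(sample, eta):
    """Joint Poisson log-density: eta * sum(x) - n * A(eta) + sum(log h(x_i))."""
    n = len(sample)
    log_h = -sum(math.lgamma(x + 1) for x in sample)   # log of prod 1 / x_i!
    return eta * sum(sample) - n * math.exp(eta) + log_h

a = [0, 2, 7, 1]                   # two samples with the same size and the same sum
b = [3, 3, 2, 2]
etas = [-1.0, 0.0, 0.5, 1.3]

# The gap between the two log-likelihoods should be the same at every eta.
gaps = [log_lik(a, e) - log_lik(b, e) for e in etas]
spread = max(gaps) - min(gaps)
```

A `spread` of essentially zero means the two samples carry the same information about $\eta$, which is what sufficiency of $\sum_i T(x_i)$ asserts.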